4 research outputs found

    Doctor of Philosophy

    The objective of this work is to examine the efficacy of natural language processing (NLP) in summarizing bibliographic text for multiple purposes. Researchers have noted the accelerating growth of bibliographic databases. Information seekers using traditional information retrieval techniques when searching large bibliographic databases are often overwhelmed by excessive, irrelevant data. Scientists have applied natural language processing technologies to improve retrieval. Text summarization, a natural language processing approach, simplifies bibliographic data while filtering it to address a user's need. Traditional text summarization can necessitate the use of multiple software applications to accommodate diverse processing refinements known as "points-of-view." A new, statistical approach to text summarization can transform this process. Combo, a statistical algorithm composed of three individual metrics, determines which elements within input data are relevant to a user's specified information need, thus enabling a single software application to summarize text for many points-of-view. In this dissertation, I describe this algorithm and the research process used in developing and testing it. The research process comprised four studies. The goal of the first study was to create a conventional schema accommodating a genetic disease etiology point-of-view, and an evaluative reference standard. This was accomplished through simulating the task of secondary genetic database curation. The second study addressed the development and initial evaluation of the algorithm, comparing its performance to the conventional schema using the previously established reference standard, again within the task of secondary genetic database curation. The third and fourth studies evaluated the algorithm's performance in accommodating additional points-of-view in a simulated clinical decision support task.
The third study explored prevention, while the fourth evaluated performance for prevention and drug treatment, comparing results to a conventional treatment schema's output. Both summarization methods identified data that were salient to their tasks. The conventional genetic disease etiology and treatment schemas located salient information for database curation and decision support, respectively. The Combo algorithm located salient genetic disease etiology, treatment, and prevention data for the associated tasks. Dynamic text summarization could potentially serve additional purposes, such as consumer health information delivery, systematic review creation, and primary research. This technology may benefit many user groups.

    Opioid use and opioid use disorder in mono and dual-system users of veteran affairs medical centers

    Introduction: Efforts to achieve opioid guideline concordant care may be undermined when patients access multiple opioid prescription sources. Limited data are available on the impact of dual-system sources of care on receipt of opioid medications. Objective: We examined whether dual-system use was associated with increased rates of new opioid prescriptions, continued opioid prescriptions and diagnoses of opioid use disorder (OUD). We hypothesized that dual-system use would be associated with increased odds for each outcome. Methods: This retrospective cohort study was conducted using Veterans Administration (VA) data from two facilities from 2015 to 2019, and included active patients, defined as Veterans who had at least one encounter in a calendar year (2015–2019). Dual-system use was defined as receipt of VA care as well as VA payment for community care (non-VA) services. Mono users were defined as those who only received VA services. There were 77,225 dual-system users and 442,824 mono users. Outcomes were three binary measures: new opioid prescription, continued opioid prescription (i.e., received an additional opioid prescription), and OUD diagnosis (during the calendar year). We conducted a multivariate logistic regression accounting for the repeated observations on patients and intra-class correlations within patients. Results: Dual-system users were significantly younger than mono users, more likely to be women, and less likely to report white race. In adjusted models, dual-system users were significantly more likely to receive a new opioid prescription during the observation period [odds ratio (OR) = 1.85, 95% confidence interval (CI) 1.76–1.93], continue prescriptions (OR = 1.24, CI 1.22–1.27), and to receive an OUD diagnosis (OR = 1.20, CI 1.14–1.27). Discussion: The prevalence of opioid prescriptions has been declining in US healthcare systems, including the VA, yet the prevalence of OUD has not been declining at the same rate.
One potential problem is that detailed notes from non-VA visits are not immediately available to VA clinicians, and information about VA care is not readily available to non-VA sources. One implication of our findings is that better health system coordination is needed. Even though care was paid for by the VA and presumably closely monitored, dual-system users were more likely to have new and continued opioid prescriptions.
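As a reading aid for the odds ratios and confidence intervals quoted above, the following is a minimal sketch of how an unadjusted odds ratio and a Wald 95% confidence interval can be computed from a 2x2 table. It is not the study's method: the abstract reports adjusted estimates from a logistic regression that accounts for repeated observations. The function name `odds_ratio_ci` and the counts in the usage line are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1, as all three intervals in the abstract do, is what supports describing an association as statistically significant at the 5% level.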

    Identifying suicide documentation in clinical notes through zero‐shot learning

    Background and Aims: In deep learning, a major difficulty in identifying suicidality and its risk factors in clinical notes is the lack of training samples given the small number of true positive instances among the number of patients screened. This paper describes a novel methodology that identifies suicidality in clinical notes by addressing this data sparsity issue through zero‐shot learning. Our general aim was to develop a tool that leveraged zero‐shot learning to effectively identify suicidality documentation in all types of clinical notes. Methods: US Veterans Affairs clinical notes served as data. The training data set label was determined using diagnostic codes of suicide attempt and self‐harm. We used a base string associated with the target label of suicidality to provide auxiliary information by narrowing the positive training cases to those containing the base string. We trained a deep neural network by mapping the training documents' contents to a semantic space. For comparison, we trained another deep neural network using the identical training data set labels and bag‐of‐words features. Results: The zero‐shot learning model outperformed the baseline model in terms of area under the curve, sensitivity, specificity, and positive predictive value at multiple probability thresholds. In applying a 0.90 probability threshold, the methodology identified notes documenting suicidality but not associated with a relevant ICD‐10‐CM code, with 94% accuracy. Conclusion: This method can effectively identify suicidality without manual annotation.
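The threshold-based metrics named in the Results (sensitivity, specificity, and positive predictive value at a probability threshold) can be sketched as follows. This is an illustrative helper under stated assumptions, not the paper's evaluation code: the function name `threshold_metrics` and the toy probabilities and labels are hypothetical, and a score at or above the threshold is assumed to count as a positive prediction.

```python
def threshold_metrics(probs, labels, threshold):
    """Sensitivity, specificity, and PPV for binary labels at one threshold.

    probs:     predicted probabilities of the positive class
    labels:    true labels (1 = positive, 0 = negative)
    threshold: scores >= threshold are treated as positive predictions
    """
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1

    def ratio(numerator, denominator):
        # Undefined metrics (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Toy data, for illustration only.
probs = [0.95, 0.92, 0.40, 0.91, 0.10]
labels = [1, 1, 1, 0, 0]
print(threshold_metrics(probs, labels, 0.90))
```

Raising the threshold generally trades sensitivity for specificity and positive predictive value, which is why the abstract compares the two models at multiple thresholds instead of a single one.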

    Answer ALS, a large-scale resource for sporadic and familial ALS combining clinical and multi-omics data from induced pluripotent cell lines.

    Answer ALS is a biological and clinical resource of patient-derived, induced pluripotent stem (iPS) cell lines, multi-omic data derived from iPS neurons, and longitudinal clinical and smartphone data from over 1,000 patients with ALS. This resource provides population-level biological and clinical data that may be employed to identify clinical-molecular-biochemical subtypes of amyotrophic lateral sclerosis (ALS). A unique smartphone-based system was employed to collect deep clinical data, including fine motor activity, speech, breathing and linguistics/cognition. The iPS spinal neurons were derived from each patient's blood, and these cells underwent multi-omic analytics including whole-genome sequencing, RNA transcriptomics, ATAC-sequencing and proteomics. These data are intended for generating integrated clinical and biological signatures using bioinformatics, statistics and computational biology to establish patterns that may lead to a better understanding of the underlying mechanisms of disease, including subgroup identification. A web portal for open-source sharing of all data was developed for widespread community-based data analytics.